Efficient Algorithms for Identifying Relevant Features

Author

Hussein Almuallim

Abstract

This paper describes efficient methods for exact and approximate implementation of the MIN-FEATURES bias, which prefers consistent hypotheses definable over as few features as possible. This bias is useful for learning domains where many irrelevant features are present in the training data. We first introduce FOCUS-2, a new algorithm that exactly implements the MIN-FEATURES bias. This algorithm is empirically shown to be substantially faster than the FOCUS algorithm previously given in [Almuallim and Dietterich, 1991]. We then introduce the Mutual-Information-Greedy, Simple-Greedy and Weighted-Greedy algorithms, which apply efficient heuristics for approximating the MIN-FEATURES bias. These algorithms employ greedy heuristics that trade optimality for computational efficiency. Experimental studies show that the learning performance of ID3 is greatly improved when these algorithms are used to preprocess the training data by eliminating the irrelevant features from ID3's consideration. In particular, the Weighted-Greedy algorithm provides an excellent and efficient approximation of the MIN-FEATURES bias.
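
To make the MIN-FEATURES bias concrete, the following is a minimal brute-force sketch, not the FOCUS-2 algorithm or any of the greedy heuristics from the paper. It assumes Boolean examples given as (feature tuple, label) pairs and searches feature subsets in order of increasing size, returning the first subset on which no two examples agree while carrying different labels. The function names `is_sufficient` and `min_features` are hypothetical and chosen only for illustration.

```python
from itertools import combinations, product


def is_sufficient(examples, features):
    """Return True if no two examples agree on `features` but differ in label."""
    seen = {}
    for x, label in examples:
        key = tuple(x[i] for i in features)
        # setdefault stores the first label seen for this projection and
        # returns the stored label on later visits.
        if seen.setdefault(key, label) != label:
            return False
    return True


def min_features(examples, n_features):
    """Exhaustively find a smallest feature subset consistent with the examples.

    Returns a tuple of feature indices, or None when the examples are
    contradictory (identical inputs with different labels).
    Illustrative only: the running time is exponential in n_features.
    """
    for size in range(n_features + 1):
        for subset in combinations(range(n_features), size):
            if is_sufficient(examples, subset):
                return subset
    return None


# Hypothetical toy data: the label is x0 XOR x2; x1 and x3 are irrelevant.
examples = [(x, x[0] ^ x[2]) for x in product((0, 1), repeat=4)]
print(min_features(examples, 4))  # (0, 2)
```

The exhaustive search is exact but examines every subset of each size before moving to the next, which is the cost that greedy approximations such as those named in the abstract trade optimality to avoid.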

Similar resources

Learning Boolean Concepts in the Presence of Many Irrelevant Features

In many domains, an appropriate inductive bias is the MIN-FEATURES bias, which prefers consistent hypotheses definable over as few features as possible. This paper defines and studies this bias in Boolean domains. First, it is shown that any learning algorithm implementing the MIN-FEATURES bias requires $\Theta\left(\frac{1}{\epsilon}\ln\frac{1}{\delta} + \frac{1}{\epsilon}\left[2^p + p\ln n\right]\right)$ training examples to guarantee PAC-learning a concept having p relev...

Call Attention to Rumors: Deep Attention Based Recurrent Neural Networks for Early Rumor Detection

The proliferation of social media in communication and information dissemination has made it an ideal platform for spreading rumors. Automatically debunking rumors at their stage of diffusion is known as early rumor detection, which refers to dealing with sequential posts regarding disputed factual claims with certain variations and highly textual duplication over time. Thus, identifying trending ...

A Model for Detecting of Persian Rumors based on the Analysis of Contextual Features in the Content of Social Networks

The rumor is a collective attempt to interpret a vague but attractive situation by using the power of words. Therefore, identifying the rumor language can be helpful in identifying it. The previous research has focused more on the contextual information to reply tweets and less on the content features of the original rumor to address the rumor detection problem. Most of the studies have been in...

Classification of encrypted traffic for applications based on statistical features

Traffic classification plays an important role in many aspects of network management such as identifying type of the transferred data, detection of malware applications, applying policies to restrict network accesses and so on. Basic methods in this field were using some obvious traffic features like port number and protocol type to classify the traffic type. However, recent changes in applicat...

Novel Randomized Feature Selection Algorithms

Feature selection is the problem of identifying a subset of the most relevant features in the context of model construction. This problem has been well studied and plays a vital role in machine learning. In this paper we present three randomized algorithms for feature selection. They are generic in nature and can be applied for any learning algorithm. Proposed algorithms can be thought of as a ...

Publication date: 1992
